In experiments, the free energy of transferring the peptide group from water to an osmolyte
solution is obtained using the transfer free energy of (Gly)$_n$, with the added assumption that a
constant incremental change in free energy with n implies that each additional unit makes an
independent contribution to the free energy. Here we test this assumption and uncover its
limitations. Together with results for cyclic-diglycine, we show that, in principle, it is not
possible to obtain a peptide group transfer free energy that is independent of the model system.
We calculate the hydration free energy, $\mu^{\rm ex}$, of acetyl-(Gly)$_n$-methyl amide
(n = 1 ... 7) peptides modeled in the extended conformation in water and osmolyte solutions.
$\mu^{\rm ex}$ versus n is linear, suggestive of independent, additive group contributions. To
probe the observed linearity further, we study the hydration of the solute bereft of water
molecules in the first hydration shell. This conditioned solute arises naturally in the theoretical
formulation and helps us focus on hydration effects uncluttered by the complexities of short-range
solute-water interactions. We subdivide the conditioned solute into n + 1 peptide groups and a
methyl end group. The binding energy of each of these groups with the solvent is Gaussian
distributed, but the near neighbor binding energies are themselves correlated: the i, i + 1
correlation is the strongest and tends to lower the free energy relative to the independent group
case. We show that the observed linearity can be explained by the similarity of near neighbor
correlations. Implications for group additive transfer free energy models are indicated.

Keywords: potential distribution theorem, regularization, protein hydration, molecular dynamics 



I. INTRODUCTION 

Group additive decomposition of the free energy of protein conformational change has a rich
history in attempts to understand the physical factors governing protein stability in solution [1].
Such efforts are at the heart of past and current efforts to understand how the solvent modulates
protein folding thermodynamics. Since the peptide bond is the most numerous group in a protein,
attempts to obtain the transfer free energy of the peptide group have occupied a particularly
important position in the broader attempt to understand the role of the solution in protein folding
thermodynamics. Indeed, such group additive transfer free energy analysis has been instrumental in
revealing that conformation-protecting osmolytes primarily exert their influence by changing the
solubility of the peptide backbone [1], an identification with significant consequences for our
understanding of protein folding. However, a clear theoretical analysis of the meaning of the
transfer free energy of the peptide group that apparently obeys group additivity has not been
satisfactorily established. Here we address this issue on the basis of a physically transparent
theoretical framework and computer simulations. The insights from this work are not limited to
the peptide group, but apply more broadly to all such group-additive decompositions of free energy
that are used in studying protein folding and stability.

In seeking the contribution of the peptide group, starting from the seminal work of Nozaki and
Tanford [2], it is common practice to consider the transfer free energy (typically from water to an
aqueous solution of an additive) of glycyl peptides of increasing chain length. The peptides can be
blocked N-acetylglycinamides (as in this study) or zwitterionic (as in the studies by Nozaki and
Tanford [2]). The transfer free energy of the peptide group has then been sought from the
difference in transfer free energy of chains of length m and n (m > n) by various constructs; for
example, for m = n + 1, the peptide free energy would be equated with the free energy difference
between the chains of length m and n. A somewhat more robust approach, termed the constant
increment method, equates the peptide transfer free energy to the slope of the transfer free energy
with respect to n. Various such constructs are possible and these have been well documented by
Auton and Bolen.
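The two constructs described above can be illustrated with a minimal sketch. The chain lengths and free energies below are hypothetical, exactly linear placeholder data (not values from this study), and the function name `constant_increment` is our own.

```python
def constant_increment(n_values, free_energies):
    """Least-squares slope and intercept of free energy versus chain length n."""
    count = len(n_values)
    mean_n = sum(n_values) / count
    mean_g = sum(free_energies) / count
    sxy = sum((n - mean_n) * (g - mean_g) for n, g in zip(n_values, free_energies))
    sxx = sum((n - mean_n) ** 2 for n in n_values)
    slope = sxy / sxx
    return slope, mean_g - slope * mean_n


# Hypothetical, exactly linear data in kcal/mol (NOT values from this study).
chain_lengths = [1, 2, 3, 4, 5, 6, 7]
transfer = [-1.0 - 0.4 * n for n in chain_lengths]

# Constant increment method: the slope is taken as the per-group value.
slope, intercept = constant_increment(chain_lengths, transfer)

# Single-difference construct (m = n + 1): successive differences.
differences = [b - a for a, b in zip(transfer, transfer[1:])]
```

For perfectly linear data the two constructs coincide; with noisy data the slope averages over all chain lengths, which is presumably why it is described as the more robust choice.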

Work by Auton and Bolen has also helped clarify some of the vexing issues related to the choice
of concentration scales and model compounds in determining the peptide group transfer free energy.
By careful consideration of peptide solubility issues, these authors have shown that reasonably
concordant values of the peptide transfer free energy can be obtained that are independent of the
concentration scale and of the model system: glycyl peptides versus cGG, the cyclic-diglycine
molecule. (For cGG, the peptide transfer free energy is sought by dividing the experimental
transfer free energies by 2.) While this concordance is pleasing, it is also somewhat troubling,
since in cGG the CO and NH of the peptide are cis and the molecule has a net zero dipole moment,
whereas in all usual proteins the CO and NH are trans and the glycyl peptides used in the
experiments can have a non-negligible dipole moment. Further, the $\phi$, $\psi$ angles in cGG are
also not consistent with what is found in proteins (George Rose, personal communication). Thus,
either the conformation of the peptide is unimportant in the transfer free energy value, or there
are other effects that lead to this result, or a combination of both.

Concerns about group additivity [11], and in particular about identifying a group additive
contribution for the peptide [13], are not new. Using a continuum dielectric model of the solvent,
Avbelj and Baldwin [13] have argued that the failure of group additivity arises from the dependence
of the hydration free energy of the peptide on the neighboring groups (that serve to occlude the
solvent medium in their model). Our point complements this, but is broader. We show that an
independent group additive contribution does not follow even when the conditions for use of the
constant increment approach are satisfied. Moreover, both electrostatic and van der Waals
(dispersion) interactions contribute to the failure of independence, each in rather subtle ways.
Thus care is needed even in decoupling free energy contributions of nonpolar groups from adjacent
polar groups, an issue that had been anticipated before [14].

Here we use theory and computer simulations to examine the vacuum to solution (S) transfer free
energy of acetyl-(Gly)$_n$-methyl amide peptides and of cGG. The free energies are obtained by a
quasichemical organization of the potential distribution theorem. A virtue of this formulation is
that it makes transparent the role of correlated fluctuations of the binding energies of two groups
on the molecule in the thermodynamics of hydration. A central observation of our work is that even
for an idealized solute with no complicated short-range solute-solvent interaction, the
group-solvent binding energies of neighboring groups are correlated. This implies that identifying
a contribution to the free energy solely due to an individual group is, in principle, not possible,
even for this idealized solute. The situation for a real solute is expected to be considerably more
complicated.

II. THEORY 

The excess chemical potential, $\mu^{\rm ex}$, of a solute in the solution is that part of the
Gibbs free energy that would vanish if the interaction between the solute and solvent were to
vanish. Formally,

$$\beta\mu^{\rm ex} = \ln \int e^{\beta\varepsilon} P(\varepsilon)\, d\varepsilon , \qquad (1)$$

where $\varepsilon = U_{S+P} - U_S - U_P$ is the binding energy of the solute with the rest of the
fluid. $U_{S+P}$ is the potential energy of the solute plus solvent system at a particular
configuration of the solvent (we assume the peptide conformation to be fixed), $U_S$ is the
potential energy of the same configuration but with the solute removed, and $U_P$ is the potential
energy of the solute, here the peptide, alone. $P(\varepsilon)$ is the probability density
distribution of $\varepsilon$. $\mu^{\rm ex}$ is the excess free energy in the liquid relative to
an ideal gas at the same density and temperature. As usual, $\beta = 1/k_{\rm B}T$, where T is the
temperature and $k_{\rm B}$ the Boltzmann constant.

Following earlier work, to calculate $\mu^{\rm ex}$ from Eq. 1 we regularize $P(\varepsilon)$ by
introducing an auxiliary constraint, a field $\phi_\lambda$ that pushes the solvent molecules away
from the solute's surface to a range $\lambda$. This construct has the virtue of tempering the
solute-solvent interaction, and, for solvent pushed far enough (typically evacuating the first
hydration shell is sufficient), the distribution of binding energies is a Gaussian. Formally, with
the introduction of the field,

$$\beta\mu^{\rm ex} = \ln x_0[\phi_\lambda] - \ln p_0[\phi_\lambda] + \beta\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)] . \qquad (2)$$

$-\ln x_0[\phi_\lambda]$ is the free energy required to apply the field in the solute-solvent
system: it reflects the strength of the solute interaction with the solvent in the inner shell.
$-\ln p_0[\phi_\lambda]$ is the free energy required to apply the field in the neat solvent system:
it reflects the intrinsic properties of the solvent. For $\phi_\lambda$ modeling a hard exclusion
of solvent, $-\ln p_0[\phi_\lambda]$ is precisely the hydrophobic contribution to hydration.
$\beta\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ is the contribution to $\beta\mu^{\rm ex}$ from
long-range solute-solvent interactions. In molecular dynamics simulations, we calculate
$-\ln x_0[\phi_\lambda]$ or $-\ln p_0[\phi_\lambda]$ simply from the work required to apply
$\phi_\lambda$ [17]. Fig. 1 gives a schematic of the decomposition of $\mu^{\rm ex}$ according to
Eq. 2.




FIG. 1. Schematic showing the physical pieces contributing 
to the solvation free energy (Eq. [2]) of the protein. This de- 
composition follows the regularization of the solute-solvent 
binding energy distribution. 

For excluding solvent from the first hydration (or inner) shell, the conditional binding energy
distribution $P(\varepsilon|\phi_\lambda)$ can be well described by a Gaussian of mean
$\langle\varepsilon|\phi_\lambda\rangle$ and variance
$\langle\delta\varepsilon^2|\phi_\lambda\rangle$ [18-21], and the long-range contribution is then
given by

$$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)] = \langle\varepsilon|\phi_\lambda\rangle + \frac{\beta}{2}\langle\delta\varepsilon^2|\phi_\lambda\rangle . \qquad (3)$$
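As a numerical illustration of how Eq. 3 follows from Eq. 1 for a Gaussian distribution, the sketch below compares the closed form with a direct sample average of $e^{\beta\varepsilon}$. The mean, variance, sample count, and the value $k_{\rm B}T \approx 0.596$ kcal/mol (about 300 K) are assumptions chosen for illustration, not parameters from the simulations.

```python
import math
import random


def mu_ex_gaussian(mean, variance, beta):
    """Closed form of Eq. 3 for a Gaussian binding energy distribution."""
    return mean + 0.5 * beta * variance


def mu_ex_sampled(energies, beta):
    """(1/beta) * ln <exp(beta * eps)> from Eq. 1, using a log-sum-exp shift."""
    shift = max(energies)
    total = sum(math.exp(beta * (e - shift)) for e in energies)
    return shift + math.log(total / len(energies)) / beta


BETA = 1.0 / 0.596  # 1/(k_B T) in mol/kcal, assuming T near 300 K

# Hypothetical Gaussian binding energies (kcal/mol); not simulation data.
rng = random.Random(0)
samples = [rng.gauss(-20.0, 1.0) for _ in range(200_000)]

closed_form = mu_ex_gaussian(-20.0, 1.0, BETA)
estimate = mu_ex_sampled(samples, BETA)
```

The sampled estimate converges to the closed form only when the variance is modest, which is one way to see why regularizing the distribution with the field is useful.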



Now consider decomposing $\varepsilon$ into contributions due to the various groups
i = 1, ..., n comprising the solute under consideration. For a pairwise additive forcefield, as is
used in this study, such a decomposition can be made unambiguously. For the conditioned solute,
even the individual binding energy distributions $P(\varepsilon_i|\phi_\lambda)$ are Gaussian,
but, in general, $\varepsilon_i$ is correlated with $\varepsilon_j$ ($j \neq i$). In this case,
$P(\varepsilon = \sum_i \varepsilon_i)$ is Gaussian with a mean
$\sum_i \langle\varepsilon_i|\phi_\lambda\rangle$ and a variance
$\sum_i \langle\delta\varepsilon_i^2|\phi_\lambda\rangle + 2\sum_{i>j} \rho_{ij}\sqrt{\langle\delta\varepsilon_i^2|\phi_\lambda\rangle\langle\delta\varepsilon_j^2|\phi_\lambda\rangle}$,
where $\rho_{ij}$ is the correlation coefficient [3]. The long-range contribution to the free
energy is then given by

$$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)] = \sum_i \mu^{\rm ex}[P(\varepsilon_i|\phi_\lambda)] + \beta \sum_{i>j} \langle\delta\varepsilon_i\,\delta\varepsilon_j|\phi_\lambda\rangle , \qquad (4)$$

where $\mu^{\rm ex}[P(\varepsilon_i|\phi_\lambda)]$ is described by Eq. 3. The second summation
in Eq. 4 can be rewritten as a sum over all nearest neighbor pairs
$\varepsilon_i, \varepsilon_{i+1}$, the next nearest pairs $\varepsilon_i, \varepsilon_{i+2}$,
etc. From the summation arranged in this fashion, we can then identify the effect of correlations
at various spatial length scales on the free energy $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$.
Note that the present formulation precisely identifies the contributions solely due to the
individual groups, namely the quantities $\mu^{\rm ex}[P(\varepsilon_i|\phi_\lambda)]$; we shall
call this the zeroth-order or self-contribution of the group i. For ease of presentation, when we
speak of, say, the (i, i + 2) correlation, we mean the correlation between the binding energies of
groups i and i + 2, respectively, with the solvent.
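The bookkeeping in Eq. 4 can be sketched as follows: given a time series of binding energies for each group, compute the self terms and then collect the covariance terms by neighbor order. The synthetic data (four groups with anti-correlated nearest neighbors) and the function names are hypothetical and serve only to show that the decomposition sums exactly to the Gaussian free energy of the total binding energy.

```python
import random


def covariance(x, y):
    """Unbiased sample covariance of two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def decompose_by_order(group_energies, beta):
    """Return (self_terms, corr_by_order) following Eq. 4.

    group_energies[i] is the binding energy series of group i;
    corr_by_order[k] sums beta * cov(eps_i, eps_{i+k}) over all i.
    """
    n = len(group_energies)
    self_terms = [sum(e) / len(e) + 0.5 * beta * covariance(e, e)
                  for e in group_energies]
    corr_by_order = {
        k: beta * sum(covariance(group_energies[i], group_energies[i + k])
                      for i in range(n - k))
        for k in range(1, n)
    }
    return self_terms, corr_by_order


def total_gaussian(group_energies, beta):
    """Eq. 3 applied to the total binding energy (sum over groups per frame)."""
    total = [sum(frame) for frame in zip(*group_energies)]
    return sum(total) / len(total) + 0.5 * beta * covariance(total, total)


BETA = 1.0 / 0.596  # mol/kcal, assuming T near 300 K

# Hypothetical data: neighbors share a noise source with opposite sign.
rng = random.Random(1)
frames = 5000
noise = [[rng.gauss(0.0, 1.0) for _ in range(frames)] for _ in range(5)]
groups = [[-1.5 + noise[i][t] - 0.5 * noise[i + 1][t] for t in range(frames)]
          for i in range(4)]

self_terms, corr_by_order = decompose_by_order(groups, BETA)
reassembled = sum(self_terms) + sum(corr_by_order.values())
```

In this toy model the nearest neighbor term `corr_by_order[1]` is negative, so the total lies below the sum of the self terms, which mirrors the qualitative statement that (i, i + 1) correlations lower the free energy.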

We have pursued the above development for the long-range piece
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$, the solvation free energy of the conditioned solute,
because in this case the binding energy distribution is well behaved. Conceptually, a
decomposition similar to Eq. 4 can also be sought for $\mu^{\rm ex}$ (Eq. 2), the net chemical
potential of the solute. But in that case the functional form of the correlation contributions
(which can be beyond linear order) and even the individual contributions are, in general, difficult
to ascertain. Our plan is to show that even for the idealized case of the conditioned solute, the
effect of near-neighbor correlations is non-trivial, and hence the increment in
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ with n is not solely a measure of the contribution of
the individual group. That then implies that such a group transfer quantity will depend on the
model system on which it was obtained. On this basis, given that
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ is one component of $\mu^{\rm ex}$ and given the
greatly enhanced complexity of short-range solute-solvent effects, it is safe to conclude that the
same holds for $\mu^{\rm ex}$ as well.



III. RESULTS AND DISCUSSION 

A. Solvation of the physical solute 

Fig. 2 makes it clear that $\mu^{\rm ex}$ versus n for blocked (Gly)$_n$ obeys a linear
dependence. Similar linearity also holds for the chemical, packing, and long-range contributions
(Eq. 2) individually. Per the constant increment method, we consider the slope of these curves as
the contribution of an individual group to the free energy. These values are collected in Table I
together with results for cGG. (Following the established experimental approach [7], the value for
cGG is divided by 2 to obtain the value for one CH$_2$CONH group.)

FIG. 2. The solvation free energy (Eq. 2) versus n for blocked (Gly)$_n$. Legend: water, TMAO,
urea.



TABLE I. Peptide group transfer free energies from vacuum to solvent obtained from the slope of
$\mu^{\rm ex}$ versus n. Values for cGG have been scaled by 1/2. Below each line for the model
system studied, we present the transfer free energy values for transfer from water to the solution
under study. All values are in kcal/mol. Standard error of the mean is about 0.1 kcal/mol
(1$\sigma$).

                        Water    Urea    TMAO
(Gly)n                  -5.0     -5.4    [illegible]
  water to solution              -0.4
cGG/2                   -6.2     -6.6    -6.2
  water to solution              -0.4

The water to aqueous osmolyte transfer free energy agrees quite well for both the (Gly)$_n$ and
cGG models. The urea concentration is about 8 M and, assuming a linear dependence of transfer free
energy on osmolyte concentration [22, 23], we find that for a 1 M urea solution the transfer free
energy is -50 ± 13 cal/mol, a value that is in good agreement with experimental estimates. We find
a net zero transfer free energy to aqueous TMAO solution (4 M). This appears likely due to
inadequacy in the forcefield model for TMAO.
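The 1 M figure follows from simple linear scaling of the Table I transfer value. In the sketch below, we assume that the quoted uncertainty of about 0.1 kcal/mol applies directly to the transfer free energy; the function name `per_molar` is ours.

```python
def per_molar(free_energy_kcal_per_mol, concentration_molar):
    """Scale a free energy linearly to 1 M; the result is in cal/mol."""
    return 1000.0 * free_energy_kcal_per_mol / concentration_molar


# Water to ~8 M urea transfer value from Table I (kcal/mol).
urea_1m = per_molar(-0.4, 8.0)      # about -50 cal/mol

# Assumption: the ~0.1 kcal/mol standard error applies to the transfer value.
urea_1m_error = per_molar(0.1, 8.0)  # about 12.5 cal/mol, quoted as 13
```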

From Table I we note that the good agreement in water to aqueous osmolyte solution transfer free
energy masks the rather poor agreement in transfer free energies from vacuum to the respective
solution. While it can be argued that water to osmolyte solution transfer is the most relevant
experimentally, our results suggest that these small values arise from differences of
substantially larger quantities. From the perspective of a physical theory, the vacuum to solution
transfer quantities have the virtue of highlighting the role of inter-group correlations
transparently. In particular, Table I clearly shows that a peptide in (Gly)$_n$ is different from a
peptide in cGG, assuming the validity of the divide-by-2 construct, itself suspect for reasons
noted in Sec. III B. Based on the analysis in Sec. III B, it seems plausible that in the water to
osmolyte transfer free energy, inter-group correlations involving the (physical) solute cancel,
leaving a net change that is insensitive to the choice of the model system.



B. Solvation of the conditioned solute 

As before (Fig. 2), even the solvation free energy of the conditioned solute,
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$, depends linearly on n (Fig. 3). For the analysis
below, we focus exclusively on the vacuum to water transfer.

FIG. 3. Long-range contribution to the free energy of blocked (Gly)$_n$ in water. The open circles
are the simulation results. The filled circles are based on using the average values of the i and
(i, i + 1) correlation contributions from the (Gly)7 chain (Table II) to reconstruct the free
energy for all other n. The (i, i + 1) correlation contribution between groups 0 and 1 and between
groups 7 and 8 for (Gly)7 is used to model similar end-group correlations for all other n.
Likewise, zeroth-order contributions from groups 0, 1, and 8 from (Gly)7 are used for all n.

For dissecting the correlation contributions to the slope, we focus on the internal groups of the
peptide, as these are the ones changing with n. For the blocked (Gly)7 model, group 0 is the
methyl, group 1 is the CONHCH$_2$ group formed between the acetyl group and the N-terminus of the
protein, and group 8 is the terminal CONHCH$_3$ group. The remaining six CONHCH$_2$ groups are
termed the internal groups. For the (Gly)3 model, per this convention, there are two internal
groups. In Table II we present the average contribution due to various orders of correlation
between these internal groups.



TABLE II. Average values of correlation contributions of various orders per internal peptide unit.
i indicates that only the contribution of the group with the solvent is included (the first term
on the right in Eq. 4); i, i + 1 indicates the 1st neighbor correlation, and so on. All values are
in kcal/mol. For reference, note that the slope of the $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$
versus n curve (Fig. 3) is -2.23 kcal/mol.

             (Gly)7    (Gly)3
i            -1.53     -1.58
i, i + 1     -0.79     -0.78
i, i + 2      0.25
i, i + 3     -0.11
Total        -2.18     -2.37

Notice that the contribution from the zeroth order (or self term; i, Table II) is fairly different
from the slope of the $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ versus n curve. As Eq. 4 shows,
this term (the summands in the first term on the right of Eq. 4) is also the one that can be
rigorously identified as a contribution solely due to the group. Progressively including
contributions from (i, i + 1), (i, i + 2), etc. correlations, we find a sum that is reasonably
close to the slope of $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ versus n. (The slight discrepancy
between the sum computed in Table II and the slope arises because the linear fit is not perfect.)
Observe that the contributions from various orders of correlation are fairly similar for (Gly)7
and (Gly)3. Likewise, the correlations of the end groups with the internal groups are also fairly
similar for these two models (data not shown), implying insensitivity to chain length for the
correlations involving long-range interactions.
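To make the comparison with the slope concrete, the short sketch below uses only the (Gly)7 column of Table II and the quoted slope of -2.23 kcal/mol to separate the self term from the summed correlation terms. The variable names are ours.

```python
# (Gly)7 column of Table II, kcal/mol per internal peptide unit.
self_term = -1.53
correlation_terms = {"i,i+1": -0.79, "i,i+2": 0.25, "i,i+3": -0.11}
slope = -2.23  # slope of the long-range free energy versus n (Fig. 3)

correlation_sum = round(sum(correlation_terms.values()), 2)
total = round(self_term + correlation_sum, 2)

# Fraction of the slope accounted for by the self term alone.
self_share = round(self_term / slope, 2)
```

On these numbers the self term accounts for roughly two thirds of the slope, with the remainder coming from inter-group correlations.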

It proves insightful to consider how well the average values of various orders of correlation for
(Gly)7 model the free energy for all other chain lengths. Towards this end, we take the average
values of the i and (i, i + 1) correlation contributions from Table II, the zeroth-order
contributions for groups 0, 1, and 8, and the (i, i + 1) contributions between groups 0 and 1 and
between groups 7 and 8, and use Eq. 4 to compute the free energy for all n (filled circles,
Fig. 3). The good agreement for all n, including n = 1 (which is all end groups in our notation),
reveals the underlying uniformity of the self (i) and nearest neighbor (i, i + 1) correlation
contributions in this model system.

We can further appreciate the subtlety in these correlation contributions by identifying the
electrostatic and dispersion contributions separately (Table III). For dispersion interactions,
all orders of correlation beyond the zeroth order tend to elevate
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$. This makes sense since a favorable interaction of a
water molecule with one center necessarily promotes a favorable interaction of that water with an
adjacent center and vice versa. (For the dispersion interactions, the relative orientation of the
water molecule is irrelevant within the forcefield model.) For electrostatics, the relative
orientations are important and near neighbor interactions are anti-correlated: a favorable
interaction of water with one site necessarily comes at the expense of a favorable interaction
with the adjacent site. For this same reason, the higher order electrostatic contributions tend to
oscillate. Observe also that the sum of the electrostatic and vdW contributions in Table III is
not precisely equal to the value when these are taken together. This arises because these
individual contributions to the binding energy are themselves correlated, so separating them is
only approximate. Finally, consistent with Ref. [14], we find that the binding energy of the
methyl end group (group 0) is anti-correlated with the binding energy of group 1 (data not shown);
this result, together with the data in Table III, suggests caution in decoupling polar and
non-polar group contributions, especially if these groups are adjacent in space.

TABLE III. Net contribution due to correlations of various orders for (Gly)7. The column marked
'All' gives the contribution with electrostatics and dispersion taken together. The columns marked
'Elec' and 'vdW' give the contributions due to electrostatics and dispersion separately. All
values are in kcal/mol. For reference, the net free energy obtained by particle insertions is
-21.6 kcal/mol.

             All      Elec     vdW
i            -17.0     3.0     -20.6
i, i + 1      -5.4    -6.4       0.7
i, i + 2       1.4     1.2       0.3
i, i + 3      -0.5    -0.6       0.0
Total        -21.5    -2.6     -19.6

Fig. 4 shows the deviation in the calculated $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ relative
to the net free energy (left hand side of Eq. 4) upon inclusion of increasing orders of
correlation. It is evident that for (Gly)7, correlations up to i, i + 3 must be included to obtain
a converged free energy.
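The convergence with correlation order can be retraced from the 'All' column of Table III and the reference value of -21.6 kcal/mol from particle insertion. The sketch assumes that the four values in that column correspond, in order, to the i, (i, i + 1), (i, i + 2), and (i, i + 3) rows.

```python
# 'All' column of Table III (kcal/mol), assumed row order:
# self (i), then (i,i+1), (i,i+2), (i,i+3).
all_column = [-17.0, -5.4, 1.4, -0.5]
net_free_energy = -21.6  # particle insertion reference, Table III caption

deviations = []
running = 0.0
for contribution in all_column:
    running += contribution
    # Deviation of the truncated sum from the net free energy.
    deviations.append(round(running - net_free_energy, 1))
```

With these inputs the deviation falls from several kcal/mol with self terms only to about 0.1 kcal/mol once (i, i + 3) correlations are included, alternating in sign along the way.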

Table IV compares the average values of the various orders of correlation in the solvation of
(Gly)7 in different solvents. Remarkably, we find that all orders of correlation excluding the
zeroth-order (self) contribution are identical. So, at least insofar as the long-range
interactions are concerned, the difference in transfer free energy from water to the osmolyte
solution is entirely determined by the self-contribution, which is also the contribution that
obeys group additivity.
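The statement above can be checked directly against the Table IV entries by differencing each solvent column against water, order by order. The dictionary layout and the function name `transfer_by_order` are our own.

```python
# Table IV: per-unit long-range contributions for (Gly)7, kcal/mol.
ORDERS = ["i", "i,i+1", "i,i+2", "i,i+3"]
table_iv = {
    "water": dict(zip(ORDERS, [-1.53, -0.79, 0.25, -0.11])),
    "urea": dict(zip(ORDERS, [-1.97, -0.79, 0.25, -0.11])),
    "tmao": dict(zip(ORDERS, [-1.53, -0.79, 0.25, -0.11])),
}


def transfer_by_order(solvent):
    """Water-to-solvent difference for each correlation order."""
    return {order: round(table_iv[solvent][order] - table_iv["water"][order], 2)
            for order in ORDERS}


urea_transfer = transfer_by_order("urea")
tmao_transfer = transfer_by_order("tmao")
```

Only the self term differs between water and urea (by -0.44 kcal/mol), and nothing differs for TMAO, consistent with the net zero TMAO transfer noted earlier.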

The above analysis clearly shows that the incremental change in
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ with respect to n includes factors beyond just the
interaction of the added group with the solvent: additivity does not imply independence.

FIG. 4. $\delta\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ is the deviation of
$\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ from the net free energy (left hand side of Eq. 4).
The inset is a schematic to indicate the order of nearest neighbors relative to the group labeled
0. Thus, when no nearest neighbor is included, we sum only the individual group contributions
(summands in the first term on the right in Eq. 4). Including the first neighbor means including
all i, i + 1 contributions to the free energy as well.



TABLE IV. Average values of various orders of correlation contributions per peptide unit for
(Gly)7 in different solvents. Rest as in Table II.

             Water    Urea     TMAO
i            -1.53    -1.97    -1.53
i, i + 1     -0.79    -0.79    -0.79
i, i + 2      0.25     0.25     0.25
i, i + 3     -0.11    -0.11    -0.11
Total        -2.18    -2.62    -2.18



Although it is not always stated explicitly, group-additive 
transfer free energy contributions are treated as indepen- 
dent contributions. Our analysis shows this is, in general, 
invalid. 

In contrast to our conclusion, the studies by Bolen and coworkers clearly show that the
group-transfer model is capable of describing the m-value (the unfolding free energy in an
osmolyte solution minus that in water) in a near quantitative fashion. From the perspective of
vacuum to solvent transfer, the m-value is a difference of differences involving four large
transfer free energy contributions (two each for the unfolded and folded states of the protein),
and some degree of cancellation of errors can be expected. But the level of agreement between
calculated and experimental m-values is remarkable and suggestive of some underlying physical
regularity. Based on the similarity of higher-order correlation contributions for the conditioned
solute in different solvents, and the observed linearity of even the packing and chemistry
contributions (data not shown), we suspect that in the m-value analysis these higher order effects
cancel, leaving only the self (or group-additive) contribution intact. Exploring this idea further
is left for future studies.



IV. CONCLUDING DISCUSSION 

Group additivity has a hallowed place in chemistry; indeed, it has even been referred to as the
4th law of thermodynamics [11]. However, unlike a small molecular solute in the gas phase that is
entirely characterized by strong, short-range interactions, a many-body system such as a protein
in a solvent is characterized by many different energy scales, ranging from strong, short-range
interactions to relatively weak but fairly long-range interactions, and group additive ideas must
then be applied with sufficient care.

Even for an idealized solute with no short-range solute- 
solvent interaction, we find that the net solvation free 
energy of the solute comprises contributions due to the 
correlated interaction of the solvent with distinct groups 
in the solute. As is intuitively reasonable, the contri- 
bution of individual group-solvent interaction to the net 
free energy is the most dominant. However, the binding 
energy of a group i with the solvent is correlated with 
the binding energy of its neighbor i ± 1, i ± 2, etc. These 
correlated fluctuations can either raise or lower the free 
energy of the solute, and importantly, they can persist 
even for spatially distant groups. For the linear (Gly)7, 
correlations persist up to i, i+3. A similar behavior, per- 
haps even longer-range of correlations, can be expected 
for a topographically complicated object. Further, the in- 
dividual electrostatic and dispersion contributions to the 
binding energies are themselves correlated, making the 
identification of separate free energy contributions due 
to polar and nonpolar groups problematic, especially if 
those groups are spatially close. 

Given that our analysis uncovers non-negligible effects 
of correlations even for an idealized solute with no di- 
rect solute-solvent contact, we must expect substantially 
more involved correlation effects for a real solute, one 
that has an even more complicated short-range interac- 
tion with the solvent, and for solutes with formal charges, 
such as charged amino acid side-chain residues. In light 
of this identification, the physical basis for why such ad- 
ditive transfer models appear successful remains to be 
explained. A plausible solution, suggested by this work, 
may rest in the similarity of the correlation contributions 
between different solutions. 



V. METHODS 

The simulation procedure closely follows our earlier work [17] and only key differences are
noted. The peptides are modeled in the extended configuration with the long axis aligned with the
diagonal of the simulation cell and the center at the center of the simulation cell. (Initial
configurations were energy minimized with restraints to keep the peptide extended.) The peptide
atoms are fixed in space throughout the simulation. The solvent was modeled by the TIP3P [25, 26]
model, and the CHARMM [27] forcefield with correction terms for dihedral angles [28] was used for
the protein. A total of 2006 TIP3P molecules solvated the protein. Parameters for urea and TMAO
were obtained from Refs. [29] and [30], respectively. A total of 449 urea molecules (for a molar
concentration of about 8 M) and 195 TMAO molecules (for a molar concentration of about 4 M) were
used. Unlike the earlier study [17], where the external field evacuated a spherical domain around
the molecule, here we apply atom-centered fields to carve out a molecular cavity in the liquid;
the functional form of the field was as before (Eq. 4b, Ref. 13). To build the field to its
eventual range of $\lambda$ = 5 Å, we progressively apply the field, and for every unit Å
increment in the range, we compute the work done in applying the field using Gauss-Legendre
quadratures. Five Gauss points (nodes 0, $\pm\frac{1}{3}\sqrt{5 - 2\sqrt{10/7}}$,
$\pm\frac{1}{3}\sqrt{5 + 2\sqrt{10/7}}$ on the reference interval [-1, 1]) are chosen for each
unit $\lambda$. At each Gauss point, the system was simulated for 1 ns and the data from the last
0.5 ns used for analysis. (Excluding more data did not change the numerical value, indicating good
convergence.) Error analysis and error propagation were performed as before. The starting
configuration for each $\lambda$ point is obtained from the ending configuration of the previous
point in the chain of states. For the packing contributions, a total of 25 Gauss points thus span
$\lambda \in [0, 5]$ Å. For the chemistry contribution, since solvent never enters
$\lambda$ < 2.5 Å, we simulate $\lambda \in [2, 5]$ Å for a total of 15 Gauss points. Separate
calculations with a lower order Gauss-Legendre quadrature and a trapezoidal rule (with $\lambda$
incremented in steps of 0.1 Å [13]) showed that results are very well converged with the
five-point quadrature (data not shown).
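The quadrature layout described above can be sketched as follows. The nodes and weights are the standard five-point Gauss-Legendre values; the callable `mean_derivative` is a hypothetical stand-in for the simulation-averaged derivative of the field energy with respect to $\lambda$ at each Gauss point, which in practice comes from a separate simulation per point.

```python
import math

# Standard five-point Gauss-Legendre nodes and weights on [-1, 1].
_A = 2.0 * math.sqrt(10.0 / 7.0)
_INNER = math.sqrt(5.0 - _A) / 3.0
_OUTER = math.sqrt(5.0 + _A) / 3.0
_W_CENTER = 128.0 / 225.0
_W_INNER = (322.0 + 13.0 * math.sqrt(70.0)) / 900.0
_W_OUTER = (322.0 - 13.0 * math.sqrt(70.0)) / 900.0
NODES = [-_OUTER, -_INNER, 0.0, _INNER, _OUTER]
WEIGHTS = [_W_OUTER, _W_INNER, _W_CENTER, _W_INNER, _W_OUTER]


def gauss_points(lam_min, lam_max):
    """(lambda, weight) pairs: five points on each unit interval [a, a + 1]."""
    points = []
    for start in range(lam_min, lam_max):
        for x, w in zip(NODES, WEIGHTS):
            # Map the reference interval [-1, 1] onto [start, start + 1].
            points.append((start + 0.5 * (x + 1.0), 0.5 * w))
    return points


def work(mean_derivative, lam_min, lam_max):
    """Quadrature estimate of the integral of mean_derivative over lambda."""
    return sum(w * mean_derivative(lam) for lam, w in gauss_points(lam_min, lam_max))


packing_points = gauss_points(0, 5)    # 25 points spanning [0, 5]
chemistry_points = gauss_points(2, 5)  # 15 points spanning [2, 5]
```

Splitting the range into unit intervals keeps each five-point rule local, so a sharp change of the integrand in one interval does not degrade the estimate elsewhere.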

The long-range contribution $\mu^{\rm ex}[P(\varepsilon|\phi_\lambda)]$ ($\lambda$ = 5 Å) was
obtained by inserting the solute [18] in a cavity (with atom-centered radius $\lambda$ = 5 Å).
1500 equally spaced cavity configurations were obtained from the last 0.375 ns of a 1 ns
simulation at $\lambda$ = 5 Å. (The starting configuration for the $\lambda$ = 5 Å simulation was
obtained from the end point of the Gauss-Legendre procedure as indicated above.) We also did
solute extraction calculations in a like fashion, with 5000 binding energy values obtained over
0.5 ns of simulation. Confirming the Gaussian distribution of binding energies, both procedures
gave free energies within 0.1 kcal/mol of each other (data not shown). The binding energies for
the correlation analysis were obtained from the solute extraction procedure.

Cyclic-diglycine was built and optimized using the Gaussian (G09) quantum chemistry package [31].
For consistency with the (Gly)$_n$ simulations, the partial charges and Lennard-Jones interaction
parameters were obtained from the backbone atoms of the CHARMM forcefield.